Sets

A set is simply a collection of objects, called its elements or members. The name of a set is typically denoted with an upper case letter () while its elements are usually denoted with lower case letters (. Sets can contain any type of object you can imagine such as numbers, letters, cars, phones, people, countries, etc. and they can contain objects of multiple types. Furthermore, since a set is itself a type of object, sets are allowed to contain other sets, too. Nevertheless, sets most often contain numbers because they are primarily used in a mathematical context.

Set Representation

There are three main ways to represent and define sets.

The descriptive form uses words to describe a set. For example, the set is the set of all odd natural numbers which are less than 12.

The set-builder form defines a set by specifying a condition that all of its members satisfies and looks like this:

The placeholder is simply there so you can use it to more easily write the condition. The | character can be read as "such that". For example, specifying the aforementioned set using set-builder notation will look like the following.

The final way to define a set is simply by listing all of its elements or listing enough of them, so that whoever is reading the definition can easily establish the pattern they follow. For example, the aforementioned set will be written as

To state that an object is a member of a particular set, we notate . To show that an object is not a member of a particular set, we use . A subset of a set is a set whose elements are all also element of . For example, if and , then is a subset of and this is denoted by . If we are unsure whether a set is a subset or is in fact equal to another set, then instead of we use .

Special Sets

  • - the empty set, which is the set with no elements and is considered to be a subset of every set
  • - the set of all natural numbers; some definitions include zero while others do not, here it is included for simplicity
  • - the set of all natural numbers with 0 explicitly included
  • - the set of all integers
  • - the set of all rationals numbers, i.e. numbers which can be represented as the division of two integers
  • - the set of all real numbers; this is the set of all the rational numbers and all the irrational numbers such as and

Set Size

The number of elements in a set is called its cardinality and is denoted by . For example, the set has a cadinality equal to 4. Some sets like this one have a finite number of elements, but others, such as the set of all natural numbers, do not. The latter are called infinite sets.

Note

If a set contains more than a single copy of one of its elements, the additional copies are not taken into account. For example, is mathematically considered the exact same set as and so the size of both sets is 5.

Set Operations

The union of two sets, denoted by , is a set which contains all the elements from and . For example,

The intersection of two sets, denoted by , is a set which contains only the elements which are found in both and . For example,

Note

If the two sets have no elements in common, then their intersection is the empty set .

The relative complement of a set with respect to another set , denoted by , is the set obtained by removing from all of its elements that are also found in . For example,

Strings

A string is a sequence of characters. The set of characters that we can choose from to make our string is called an alphabet and is usually denoted by . For example, if , then some valid strings over that alphabet will be abcd, ac,acd,c, etc.

The set of all strings with a certain length over some alphabet is denoted by . For example, the set of 2-letter strings which we can make from is

If we wanted to denote the set of all possible strings of any finite length over a given alphabet , then we would write or, for our example, . This would be the set of all strings which can be written with the letter a,b,c and d such as ab or aaccdba.

Special Strings

  • the empty string "" which has no characters and can be constructed with any alphabet
  • binary strings - strings which only contain 0s and 1s

Functions

A function takes an input and produces an output. The inputs of a function are called its arguments and can be different types of objects and so can its output. For example, a function may take in a natural number and a binary string and may output a single bit. The types of the inputs and outputs of the functions are specified by sets in its declaration which has the following syntax:

Example

Consider the following function:

We do not know what precisely this function does, but we know that it takes a binary string of length 3 and outputs a single bit - 0 or 1. Similarly, the function

takes in a natural number and a binary string of any length and outputs two binary strings of arbitrary length, too. An example of such a function would a function which splits a given binary string at the position indicated by the natural number and returns the two split parts.

The input sets are called the function's domain and codomain.

Function Definition

A function definition describes what the function outputs given a particular input and has the syntax

The expression can be a mathematical formula, it can be a sentence explaining what the function does, or it can be a mixture of both.

Example

The function which returns the square of its input would be defined as follows:

The is just an arbitrary placeholder for the argument - we could have very well used or a word or anything we would like.

The function is the function which outputs 1 if its input is a palindrome string and outputs 0 otherwise. This was an example of a definition with a sentence.

Functions can also be piecewise-defined. This is when the function does different things depending on whether its input satisfies a given condition. For example, the function can be defined as:

The absolute value function is also piecewise-defined:

Finally, a function can be specified by a table listing all its inputs and their corresponding outputs. For example,

04
217
31
426
......

This does not give us a very good idea of what the function is actually supposed to do, but it certainly is a way to define it.

Partial & Total Functions

A function need not be defined for all values in its domain. For example, the division function , or alternatively , is not defined for because one cannot divide by 0. Such functions are called partial and the set of all values for which the function is actually defined is called its natural domain. This can be seen from the following diagram for a function :

The domain is , while the natural domain is . A function which is defined for all values in its domain is called a total function.

Injection, Surjection and Bijection

These are terms which describe the relationship a function establishes between its input sets and its output sets.

An injective function, or one-to-one function, is a function which given two different inputs, will always produce two different outputs - every element in its input sets is mapped to a single element from its output sets. An example of such a function is - there are no two inputs for which . However, the function is not an injection because opposite numbers produce the same output, i.e. .

A surjective function is a function which covers its entire codomain. For example, with is a surjection because every real number can be produced from it, i.e. for every there is at least one number such that . Contrastingly, the absolute value function is not surjective because it cannot produce negative values. The subset of the codomain which contains all values which can be obtained from the function is called the function's image.

A bijective function, also known as a one-to-one map or one-to-one correspondence, is a function which is both surjective and injective, i.e. it covers its entire codomain and assigns to every element it in it, only one element from its natural domain.

Logical Operations

There are a few functions used extensively throughout cryptography and computer science. Although they are defined on single bits, every one of them can be extended to binary strings simply by applying the function on a bit-by-bit basis.

Logical NOT

The function takes a single bit and flips its value - if the bit is 0 it becomes 1 and if it is 1 it becomes zero.

aNOT(a)
01
10

Notation

The function can also be written as .

Logical AND

The function takes two bits and outputs 1 only if both bits are equal to 1.

abAND(a,b)
000
010
100
111

Notation

The function can also be written as .

Logical OR

The function takes two bits and outputs 1 if either one (or both) of them is 1.

abOR(a,b)
000
011
101
111

Notation

The function can also be written as .

Exclusive OR

The eXclusive OR function, , is similar to the logical OR operation, however it outputs 1 if either one of its inputs is 1, but not both.

abXOR(a,b)
000
011
101
110

Notation

The function can also be written as .

This function is ubiquitous in cryptography due to its four essential properties:

PropertyFormula
Commutativity
Associativity
Identity
Involution

Commutativity means that the two inputs can change places and the output would still be the same. Associativity means that, given a chain of XOR operations, the order in which they are executed is irrelevant to the final result. Identity indicates that there is a specific input, called the identity element, for which the XOR operation simply outputs the other input.

Involution is a fancy way of saying that XOR is its own inverse operation. Given the output of a XOR operation and one of its inputs, the other input can be obtained by XOR-ing the output with the known input.

XOR(a,a)

Another interesting property of is that XOR-ing a bit with itself will always produce 0. This is often used in computers to reset a register to 0.

Negligible Functions

Definition: Negligible Function

A function is negligible if for every polynomial there exists a number such that for every .

The definition itself is not that important, just remember that a negligible function approaches 0 and it does so quickly as its input approaches infinity.

Definition Breakdown

Essentially, a function is negligble if it approaches 0 as its input becomes larger and larger. That is, no matter how big a polynomial one can think of, after some input the function will always be smaller than the reciprocal of the polynomial.

The reason the function outputs a number between 0 and 1 is that such functions are usually used in the context of probabilities (as is the case here).

The reason we want the negligible function to get smaller and smaller as its input gets larger and larger is because we are using the key length for its input, so we want to say that longer keys are still more secure than shorter ones but at the same time we do not need to use massive keys. By today's standards, a reasonable negligible function would be one which is already on the order of for an input . So, not only does the function need to approach 0, but it also needs to do so fairly quickly.

Probability

When we perform an experiment such as tossing a fair coin, we obtain a certain result from it called its outcome.

Definition: Outcome of an Experiment

The outcome of an experiment is all the information about the experiment after it has been carried out.

For the experiment of the coin toss, the outcome is simply the coin's face after the toss and will be either heads () or tails (). If the coin was tossed three times, then the outcome of this experiment could be or or , etc. Therefore, different experiments can have multiple possible outcomes and the set of all possible outcomes is called the sample space for the experiment.

Definition: Sample Space

The sample space of an experiment is the set of all possible outcomes from the experiment.

Example

Consider the experiment of tossing a coin three times. Its sample space is , or equivalently if we encoded "heads" with and "tails" with .

Each outcome can be associated with a number, called its probability, which describes how likely this outcome is. However, not all outcomes in the sample space need to have the same probability. Suppose that our coin was "rigged" (maybe it weighed more on one side) and actually was more inclined to result in heads rather than tails. Then, if the coin was tossed three times, the outcome would clearly be more likely than . The way probability is assigned to the outcomes in the sample space is called a probability function.

Formal Definition: Probability Space

A probability space is a sample space with a total function such that

The function is called a probability function over the sample space .

Definition Breakdown

The probability function assigns to each possible outcome a probability value between 0 and 1. The sum of all the probabilities must be one because some outcome is guaranteed to happen. If the probabilities did not sum up to one, then there would be a chance that the experiment resulted in an outcome outside its sample space, which is impossible, since the sample space is the set of all possible outcomes.

If all outcomes from the experiment are equally likely, then they have the same probability and the probability of every outcome in the experiment's sample space is

When this is the case, the probability function is called uniform.

Events

An event can be thought of as a subset of the sample space of a given experiment which contains only the outcomes we are interested in. Then we would say that an event has occurred if the outcome after performing the experiment is in .

The probability of this event occurring (i.e. getting one of its elements as an outcome), denoted by for the sample space , is the sum of the probabilities of all outcomes in the event.

When the sample space is understood from context, this can be simply written as

Example

If we wanted to describe the event that we get "tails" an even number of times from the three coin tosses, then we would do it as . The probability of this event is the sum of the probabilities of its outcomes. We assumed a fair coin, so each outcome in the sample space has the same probability . Then,

The total number of outcomes, , is eight as we saw earlier, so

Logic with Events

For an event , we can describe the event which simply encompasses all outcomes for which does not occur. The probability of is the probability that does not happen and is equal to the following:

Given two possible events and , we can talk about both and happening or or (or both) happening. These correspond to the intersection and union of the two events, respectively. Therefore,

Random Variables

A random variable (which is a terrible misnomer, but again, mathematicians...) is a way to assign a number to every outcome in the sample space . Formally, a random variable is a function .

Example

Consider the experiment of rolling a fair die three times. Each roll has six possible outcomes - - and there are three rolls, so the sample space is . One possible random variable for this experiment would be the sum of the points from the three rolls ( in this case).

In fact, we have already seen another possible random variable which can be defined for every sample space - that's right, probability! Since the probability function assigns to every outcome in the sample space a number ranging from to (which is a subset of the real numbers), this means that it is a random variable.

Expectation Value

The expectation value of a random variable over a sample space , denoted by or , is the average value of the random variable:

The expectation value is calculated by summing all the values of the random variable for the outcomes in the sample space and then dividing it by the total number of outcomes.

Example

For the previous example where was the random variable which for each outcome was equal to the sum of the three rolls, the expectation value can be calculated as follows:

Of course, calculating this by summing up all the numbers for every outcome is tedious, but it can be circumvented using some properties of expectation.

There are two properties of the expectation value that one should be aware of.

Linearity

For every two random variables and over the same sample space , the expectation value of their sum (which is itself a random variable defined as for every ) is equal to sum of the expectation values of and .

Similarly, for every random variable and constant , the expectation value of multiplied by is equal to multiplied by the expectation value of .

Proof of Linearity

For the sum part,

For the multiplication by a constant part,

Example

Linearity can be used to calculate the expectation of the random variable which we defined for the experiment of rolling a dice three times. For each separate roll the sum is just the number of points on the die's face and the sum of three rolls is just the sum of the points from the three rolls which can also be said as "the sum of the three rolls is the sum of the sums of each separate roll". This allows us to use linearity.

If we denoted the number of points from the first, second and third roll with , respectively, then the final outcome will be written as (this is concatenation, not multiplication) and we have, by linearity,

Distributions

Random variables (which output only real numbers) are a special case of all total surjective functions which assign some value in the finite output set to every element of . The function is surjective, so is the set of all possible outputs and for every there must be at least one for which . However, that doesn't stop the function from outputting the same given two or more different . The number of times that each is obtained when executing on every is described by a probability distribution.

Formal Definition: Probability Distribution

A probability distribution over a finite set is a total probability function such that

Definition Breakdown

This definition is quite broad and does not even mention the function . This is because a probability distribution is just a way to assign a probability value to every member of a set .

When we say that we "choose a random member from a set " according to some distribution , we simply mean that the probability of choosing a particular is equal to .

The requirement that the sum of all the probabilities is equal to 1 is very intuitive - we are choosing from a finite set , so we must get some member of it.

This is all great, but how do we know what probability will assign to a given . This is where the function comes in. We say that a "distribution over a set is obtained by sampling and outputting " when the set is generated by executing on every and counting how many produce a specific output in order to define the probability function . For each then, the probability function is defined as follows:

Probability

In this way, it makes some sense to call this a probability function because tells us how likely it is that outputs when choosing an uniformly at random.